Task: Information Systems: Data
This chapter describes the Data Architecture part of Phase C.
Purpose

The objective here is to define the major types and sources of data necessary to support the business, in a way that is:

  • Understandable by stakeholders
  • Complete and consistent
  • Stable

It is important to note that this effort is not concerned with database design. The goal is to define the data entities relevant to the enterprise, not to design logical or physical storage systems. (However, linkages to existing files and databases may be developed, and may demonstrate significant areas for improvement.)

Main Description

Approach

Key Considerations for Data Architecture

Data Management

When an enterprise has chosen to undertake large-scale architectural transformation, it is important to understand and address data management issues. A structured and comprehensive approach to data management enables the effective use of data to capitalize on its competitive advantages.

Considerations include:

  • A clear definition of which application components in the landscape will serve as the system of record or reference for enterprise master data
  • Will there be an enterprise-wide standard that all application components, including software packages, need to adopt? (In the main, packages can be prescriptive about their data models and may not be flexible.)
  • Clearly understand how data entities are utilized by business functions, processes, and services
  • Clearly understand how and where enterprise data entities are created, stored, transported, and reported
  • What is the level and complexity of data transformations required to support the information exchange needs between applications?
  • What will be the requirement for software in supporting data integration with the enterprise's customers and suppliers (e.g., use of ETL tools during the data migration, data profiling tools to evaluate data quality, etc.)?
Data Migration

When an existing application is replaced, there will be a critical need to migrate data (master, transactional, and reference) to the new application. The Data Architecture should identify data migration requirements and also provide indicators as to the level of transformation, weeding, and cleansing that will be required to present data in a format that meets the requirements and constraints of the target application. The objective is that the target application has quality data when it is populated. Another key consideration is to ensure that an enterprise-wide common data definition is established to support the transformation.
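
As an illustration of what an "indicator of the level of cleansing" might look like, the following is a minimal sketch that counts how many source records would violate the constraints of a target application. The `FieldRule` class, the field names, and the sample records are all hypothetical and exist only for this example; a real migration would normally rely on dedicated data profiling tools.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldRule:
    """Hypothetical target-application constraint for a single field."""
    name: str
    required: bool = True
    max_length: Optional[int] = None


def profile_records(records, rules):
    """Count, per field, how many source records violate the target constraints."""
    violations = {rule.name: 0 for rule in rules}
    for record in records:
        for rule in rules:
            value = record.get(rule.name)
            if value is None or value == "":
                # Missing value: only a problem if the target requires the field.
                if rule.required:
                    violations[rule.name] += 1
            elif rule.max_length is not None and len(str(value)) > rule.max_length:
                # Value present but too long for the target field.
                violations[rule.name] += 1
    return violations


# Illustrative source data (assumed, not taken from any real system).
source = [
    {"customer_id": "C001", "name": "Acme Ltd", "country": "GB"},
    {"customer_id": "C002", "name": "", "country": "United Kingdom"},
    {"customer_id": None, "name": "Globex", "country": "US"},
]
rules = [
    FieldRule("customer_id"),
    FieldRule("name"),
    FieldRule("country", max_length=2),
]

report = profile_records(source, rules)
print(report)
```

A per-field violation count of this kind gives a rough first estimate of how much cleansing each data entity needs before it can be loaded into the target application.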

Data Governance

Data governance considerations ensure that the enterprise has the necessary dimensions in place to enable the transformation, as follows:

  • Structure: This dimension pertains to whether the enterprise has the necessary organizational structure and the standards bodies to manage data entity aspects of the transformation.
  • Management System: This dimension pertains to whether the enterprise has the necessary management system and data-related programs to manage the governance aspects of data entities throughout their lifecycle.
  • People: This dimension addresses what data-related skills and roles the enterprise requires for the transformation. If the enterprise lacks such resources and skills, the enterprise should consider either acquiring those critical skills or training existing internal resources to meet the requirements through a well-defined learning program.

Architecture Repository

As part of this phase, the architecture team will need to consider what relevant Data Architecture resources are available in the organization's Architecture Repository (see Part V, Architecture Repository), in particular, generic data models relevant to the organization's industry "vertical" sector. For example:

  • ARTS has defined a data model for the Retail industry.
  • Energistics has defined a data model for the Petrotechnical industry.
Steps
Select Reference Models, Viewpoints, and Tools

Review and validate (or generate, if necessary) the set of data principles. These will normally form part of an overarching set of architecture principles. Guidelines for developing and applying principles, and a sample set of data principles, are given in Part III, Architecture Principles.

Select relevant Data Architecture resources (reference models, patterns, etc.) on the basis of the business drivers, stakeholders, concerns, and Business Architecture.

Select relevant Data Architecture viewpoints; i.e., those that will enable the architect to demonstrate how the stakeholder concerns are being addressed in the Data Architecture. Examples include stakeholders of the data (regulatory bodies, users, generators, subjects, auditors, etc.), various time dimensions (real-time, reporting period, event-driven, etc.), locations, and business processes.

Identify appropriate tools and techniques (including forms) to be used for data capture, modeling, and analysis, in association with the selected viewpoints. Depending on the degree of sophistication warranted, these may comprise simple documents or spreadsheets, or more sophisticated modeling tools and techniques such as data management models, data models, etc. Examples of data modeling techniques are:

  • Entity-relationship diagram
  • Class diagrams
  • Object role modeling
Determine Overall Modeling Process

For each viewpoint, select the models needed to support the specific view required, using the selected tool or method.

Examples of data models include:

  • The Department of Defense Architecture Framework (DoDAF) Logical Data Model
  • ARTS Data Model for the Retail industry
  • Energistics Data Model for the Petrotechnical industry

Ensure that all stakeholder concerns are covered. If they are not, create new models to address concerns not covered, or augment existing models (see above).

The recommended process for developing a Data Architecture is as follows:

  • Collect data-related models from existing Business Architecture and Application Architecture materials
  • Rationalize data requirements and align with any existing enterprise data catalogs and models; this allows the development of a data inventory and entity relationship
  • Update and develop matrices across the architecture by relating data to business service, business function, access rights, and application
  • Elaborate Data Architecture views by examining how data is created, distributed, migrated, secured, and archived
Identify Required Catalogs of Data Building Blocks

The organization's data inventory is captured as a catalog within the Architecture Repository. Catalogs are hierarchical in nature and capture a decomposition of a metamodel entity and also decompositions across related model entities (e.g., logical data component -> physical data component -> data entity).

Catalogs form the raw material for development of matrices and diagrams and also act as a key resource for managing the portfolio of business and IT capabilities.

During the Business Architecture phase, a Business Service/Information diagram was created showing the key data entities required by the main business services. This is a prerequisite to successful Data Architecture activities.

Using the traceability from application to business function to data entity inherent in the content framework, it is possible to create an inventory of the data needed to be in place to support the Architecture Vision.

Once the data requirements are consolidated in a single location, it is possible to refine the data inventory to achieve semantic consistency and to remove gaps and overlaps.
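
The traceability from application to business function to data entity can be sketched as a simple join over two mappings. The example below is only an illustration: the function `build_data_inventory`, the application names, and the business functions are hypothetical, and a real inventory would be derived from the content of the Architecture Repository.

```python
from collections import defaultdict


def build_data_inventory(app_functions, function_entities):
    """Derive, for each data entity, the applications that depend on it.

    app_functions:     application -> business functions it supports
    function_entities: business function -> data entities it needs
    """
    inventory = defaultdict(set)
    for app, functions in app_functions.items():
        for function in functions:
            for entity in function_entities.get(function, ()):
                inventory[entity].add(app)
    # Sort for a stable, readable inventory listing.
    return {entity: sorted(apps) for entity, apps in sorted(inventory.items())}


# Illustrative mappings (assumed for this example only).
app_functions = {
    "CRM": ["Sales", "Customer Service"],
    "Billing": ["Invoicing"],
}
function_entities = {
    "Sales": ["Customer", "Order"],
    "Customer Service": ["Customer"],
    "Invoicing": ["Customer", "Invoice"],
}

inventory = build_data_inventory(app_functions, function_entities)
for entity, apps in inventory.items():
    print(entity, "->", apps)
```

With all requirements in one structure, entities that appear under several applications (here, "Customer") become obvious candidates for checking semantic consistency and overlap.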

The following catalogs should be considered for development within a Data Architecture:

  • Data Entity/Data Component catalog

The structure of catalogs is based on the attributes of metamodel entities, as defined in Part IV, Content Metamodel.
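
The hierarchical nature of a catalog can be illustrated with a small tree structure that is flattened into catalog rows. This is a sketch under stated assumptions: the `CatalogItem` class, its two attributes, and the sample component names are hypothetical, whereas the attributes of a real catalog come from the content metamodel.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CatalogItem:
    """Hypothetical catalog node; 'kind' names the metamodel entity type."""
    kind: str
    name: str
    children: list[CatalogItem] = field(default_factory=list)


def flatten(item, path=()):
    """Yield (full path, kind) rows for the item and its descendants, depth first."""
    current = path + (item.name,)
    yield " / ".join(current), item.kind
    for child in item.children:
        yield from flatten(child, current)


# Illustrative decomposition following the order given in the text.
catalog = CatalogItem(
    "logical data component",
    "Customer Information",
    [
        CatalogItem(
            "physical data component",
            "CRM Database",
            [
                CatalogItem("data entity", "Customer"),
                CatalogItem("data entity", "Contact"),
            ],
        )
    ],
)

rows = list(flatten(catalog))
for path, kind in rows:
    print(f"{kind:25} {path}")
```

Flattening the tree into rows keeps the decomposition visible in each path while producing a form that is easy to reuse when building matrices.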

Identify Required Matrices

Matrices show the core relationships between related model entities.

Matrices form the raw material for development of diagrams and also act as a key resource for impact assessment.

At this stage, an entity-to-application-systems matrix could be produced to validate this mapping. How data is created, maintained, transformed, and passed to other applications, or used by other applications, will now start to be understood. Obvious gaps, such as entities that never seem to be created by an application or data that is created but never used, need to be noted for later gap analysis.
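
One common way to record such a matrix is with create/read/update/delete (CRUD) letters per entity and application. The sketch below assumes that representation and a hypothetical helper, `find_matrix_gaps`, to flag the two kinds of obvious gap mentioned above. It also assumes that any read, update, or delete counts as "use"; the entity and application names are illustrative.

```python
def find_matrix_gaps(matrix):
    """Flag entities that are never created, or created but never used.

    matrix maps (entity, application) -> a string of CRUD letters.
    """
    entities, created, used = set(), set(), set()
    for (entity, _application), operations in matrix.items():
        entities.add(entity)
        ops = set(operations.upper())
        if "C" in ops:
            created.add(entity)
        if ops & {"R", "U", "D"}:
            used.add(entity)
    return {
        "never_created": sorted(entities - created),
        "created_never_used": sorted(created - used),
    }


# Illustrative matrix (assumed for this example only).
matrix = {
    ("Customer", "CRM"): "CRU",
    ("Customer", "Billing"): "R",
    ("Invoice", "Billing"): "C",
    ("Product", "Web Shop"): "R",
}

gaps = find_matrix_gaps(matrix)
print(gaps)
```

In this example "Product" is read but never created by any listed application, and "Invoice" is created but never used; both would be noted for the later gap analysis.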

The rationalized data inventory can be used to update and refine the architectural diagrams of how data relates to other aspects of the architecture.

Once these updates have been made, it may be appropriate to drop into a short iteration of Application Architecture to resolve the changes identified.

The following matrices should be considered for development within a Data Architecture:

  • Data Entity/Business Function (showing which data supports which functions and which business function owns which data)
  • Business Service/Information (developed during the Business Architecture phase)
  • System/Data (developed across the Application Architecture and Data Architecture phases)

The structure of matrices is based on the attributes of metamodel entities, as defined in Part IV, Content Metamodel.

Identify Required Diagrams

Diagrams present the Data Architecture information from a set of different perspectives (viewpoints) according to the requirements of the stakeholders.

Once the data entities have been refined, a diagram of the relationships between entities and their attributes can be produced.

It is important to note at this stage that information may be a mixture of enterprise-level data (from system service providers and package vendor information) and local-level data held in personal databases and spreadsheets.

The level of detail modeled needs to be carefully assessed. Some physical system data models will exist down to a very detailed level; others will only have core entities modeled. Not all data models will have been kept up-to-date as applications were modified and extended over time. It is important to achieve a balance in the level of detail provided; the two extremes are reproducing existing detailed physical data schemas and presenting only high-level process maps and data requirements.

The following diagrams should be considered for development within a Data Architecture:

  • Class diagram
  • Data Dissemination diagram
  • Data Lifecycle diagram
  • Data Security diagram
  • Data Migration diagram
  • Class Hierarchy diagram
Identify Types of Requirement to be Collected

Once the Data Architecture catalogs, matrices, and diagrams have been developed, architecture modeling is completed by formalizing the data-focused requirements for implementing the Target Architecture.

Within this step, the architecture engagement should identify types of requirement that must be met by the architecture implementation, including:

  • Functional requirements
  • Non-functional requirements
  • Assumptions
  • Constraints
  • Domain-specific Data Architecture principles
  • Policies
  • Standards
  • Guidelines
  • Specifications
Develop Baseline Data Architecture Description

Develop a Baseline Description of the existing Data Architecture, to the extent necessary to support the Target Data Architecture. The scope and level of detail to be defined will depend on the extent to which existing data elements are likely to be carried over into the Target Data Architecture, and on whether architectural descriptions exist, as described in Approach. To the extent possible, identify the relevant Data Architecture building blocks, drawing on the Architecture Repository (see Part V, Architecture Repository).

Where new architecture models need to be developed to satisfy stakeholder concerns, use the models identified within Step 1 as a guideline for creating new architecture content to describe the Baseline Architecture.

Develop Target Data Architecture Description

Develop a Target Description for the Data Architecture, to the extent necessary to support the Architecture Vision and Target Business Architecture. The scope and level of detail to be defined will depend on the relevance of the data elements to attaining the Target Architecture, and on whether architectural descriptions exist. To the extent possible, identify the relevant Data Architecture building blocks, drawing on the Architecture Repository (see Part V, Architecture Repository).

Where new architecture models need to be developed to satisfy stakeholder concerns, use the models identified within Step 1 as a guideline for creating new architecture content to describe the Target Architecture.

Perform Gap Analysis

First, verify the architecture models for internal consistency and accuracy.

Note any changes to the viewpoint represented in the selected models from the Architecture Repository, and document them.

Test architecture models for completeness against requirements.

Identify gaps between the baseline and target:

  • Create gap matrix, as described in Part III, Gap Analysis
  • Identify building blocks to be carried over, classifying as either changed or unchanged
  • Identify eliminated building blocks
  • Identify new building blocks
  • Identify gaps and classify as those that should be developed and those that should be procured
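
The carried-over, eliminated, and new categories above can be derived mechanically once the baseline and target building blocks are listed. The following is a minimal sketch, assuming each architecture is represented as a mapping from building block name to a version or description string; the function name and sample building blocks are hypothetical. Deciding whether a gap should be developed or procured remains a judgment for the architecture team and is not automated here.

```python
def classify_building_blocks(baseline, target):
    """Compare baseline and target building blocks by name and description.

    baseline, target: building block name -> version or description string
    """
    carried_over = baseline.keys() & target.keys()
    return {
        "unchanged": sorted(n for n in carried_over if baseline[n] == target[n]),
        "changed": sorted(n for n in carried_over if baseline[n] != target[n]),
        "eliminated": sorted(baseline.keys() - target.keys()),
        "new": sorted(target.keys() - baseline.keys()),
    }


# Illustrative building blocks (assumed for this example only).
baseline = {
    "Customer Master": "v1",
    "Order History": "v1",
    "Fax Directory": "v1",
}
target = {
    "Customer Master": "v2",
    "Order History": "v1",
    "Consent Register": "v1",
}

result = classify_building_blocks(baseline, target)
for category, names in result.items():
    print(category, "->", names)
```

The "eliminated" and "new" lists correspond to the row and column entries of a gap matrix that have no counterpart on the other side.
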
Define Roadmap Components

Following creation of a Baseline Architecture, Target Architecture, and gap analysis, a data roadmap is required to prioritize activities over the coming phases.

This initial Data Architecture roadmap will be used as raw material to support more detailed definition of a consolidated, cross-discipline roadmap within the Opportunities & Solutions phase.

Resolve Impacts Across the Architecture Landscape

Once the Data Architecture is finalized, it is necessary to understand any wider impacts or implications.

At this stage, other architecture artifacts in the Architecture Landscape should be examined to identify:

  • Does this Data Architecture create an impact on any pre-existing architectures?
  • Have recent changes been made that impact the Data Architecture?
  • Are there any opportunities to leverage work from this Data Architecture in other areas of the organization?
  • Does this Data Architecture impact other projects (including those planned as well as those currently in progress)?
  • Will this Data Architecture be impacted by other projects (including those planned as well as those currently in progress)?
Conduct Formal Stakeholder Review

Check the original motivation for the architecture project and the Statement of Architecture Work against the proposed Data Architecture. Conduct an impact analysis to identify any areas where the Business and Application Architectures (e.g., business practices) may need to change to cater for changes in the Data Architecture (for example, changes to forms or procedures, application systems, or database systems).

If the impact is significant, this may warrant the Business and Application Architectures being revisited.

Identify any areas where the Application Architecture (if generated at this point) may need to change to cater for changes in the Data Architecture (or to identify constraints on the Application Architecture about to be designed).

If the impact is significant, it may be appropriate to drop into a short iteration of the Application Architecture at this point.

Identify any constraints on the Technology Architecture about to be designed, refining the proposed Data Architecture only if necessary.

Finalize the Data Architecture
  • Select standards for each of the building blocks, re-using as much as possible from the reference models selected from the Architecture Repository
  • Fully document each building block
  • Conduct final cross-check of overall architecture against business requirements; document rationale for building block decisions in the architecture document
  • Document final requirements traceability report
  • Document final mapping of the architecture within the Architecture Repository; from the selected building blocks, identify those that might be re-used, and publish via the Architecture Repository
  • Finalize all the work products, such as gap analysis
Create Architecture Definition Document

Document rationale for building block decisions in the Architecture Definition Document.

Prepare Data Architecture sections of the Architecture Definition Document, comprising some or all of:

  • Business data model
  • Logical data model
  • Data management process model
  • Data Entity/Business Function matrix
  • Data interoperability requirements (e.g., XML schema, security policies)

If appropriate, use reports and/or graphics generated by modeling tools to demonstrate key views of the architecture. Route the document for review by relevant stakeholders, and incorporate feedback.